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^ ' Abstract 

, A version of the time-parallel algorithm parareal is analyzed and 

(-H I applied to stochastic models in chemical kinetics. A fast predictor at 

the macroscopic scale (evaluated in serial) is available in the form of 
the usual reaction rate equations. A stochastic simulation algorithm 
I is used to obtain an exact realization of the process at the mesoscopic 

^ ' scale (in parallel). 

^ . The underlying stochastic description is a jump process driven by 

[ the Poisson measure. A convergence result in this arguably difficult 

^ ' setting is established suggesting that a homogenization of the solution 

^T) • is advantageous. We devise a simple but highly general such technique. 

, Three numerical experiments on models representative to the field 

' of computational systems biology illustrate the method. For non-stiff 

■ problems, it is shown that the method is able to quickly converge even 

• • , when stochastic effects are present. For stiff problems we are instead 

. ^ I able to obtain fast convergence to a homogenized solution. 

^ ' Overall, the method builds an attractive bridge between on the 

^ . one hand, macroscopic deterministic scales and, on the other hand, 

mesoscopic stochastic ones. This construction is clearly possible to 
apply also to stochastic models within other fields. 

Keywords: Parareal, reaction rate equations, jump process, homog- 
enization, next reaction method, stochastic reaction-diffusion. 

AMS subject classification: 65C40, 60J75, 60J22, 60H35, 68W10. 



* Financial support has been obtained from the Swedish National Graduate School in 
Mathematics and Computing and the Swedish Foundation for Strategic Research. 



1 



1 Introduction 



It has been shown in several studies [l3|, l36|, l38|, |40| that stochastic descrip- 



tions of biochemical networks are necessary for understanding and explain- 
ing the mechanisms inside living cells. Such networks often contain species 
present in copy numbers down to a few hundreds 25|], hinting why discrete- 
ness and stochasticity becomes important. Examples of when random ef- 
fects are pronounced include the regularity and drift of circadian oscillations 



46[ |. spontaneous separation in bistable systems and the creation of new 
steady-states 45| . 

An accurate such stochastic model can be obtained directly from micro- 
physical considerations as a consequence of the Markov property 23j . Essen- 
tially, in a diffusion-controlled system, reactive collisions between molecules 
are rare events, implying a rapid loss of dependence on the systems history. 

Sample realizations of any given chemical network can be obtained by 
randomly firing reactions and correspondingly updating the systems state; 
the most well-known such algorithm is Gillespie's stochastic simulation al- 
gorithm (SSA) [i^]. Although conceptually simple, detailed simulation of 
complex reaction networks as found for example in living cells remains a 
computationally very intensive problem. 

The parareal algorithm for the solution of evolution equations on a time- 
interval [0, T] was suggested in the note |33| as a method for the parallel 
solution of problems in real time. An improved version, better tuned to 
nonlinear problems, was subsequently applied to problems in control theory 



351 ] and molecular dynamics [Sj] to mention just a few. 

The method is built around a predictor-corrector step in which a coarse 
solver is used as a preconditioner to a fine solver. It thus incorporates both 
multigrid and domain decomposition ideas, but is unique in that it is a purely 
parallel algorithm; it has no value when executed on a serial machine. 

The idea suggested in this paper is to use the macroscopic approximation 
of the chemical system, the reaction rate equations, as the coarse solver while 
a stochastic simulation is used as the fine scale solver. This setup is available 
also for other types of stochastic evolution equations but has not received 
much attention. The reason is probably that any deterministic model is a 
quite poor formal approximation. On the other hand, any macroscopic trend 
in the solution is likely to be quickly obtained. Also, with a deterministic 
coarse solver, highly efficient implicit integrators are available which greatly 
simplifies the implementation. 

The present paper is related to references [sl, [s^]. In the former, the anal- 
ysis of parareal when applied to ordinary differential equations (ODEs) and 
stochastic ODEs (SDEs) driven by Wiener processes is treated. In the lat- 
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ter, the parareal algorithm is applied to linear mass-balance (deterministic) 
chemical kinetics with disparate rates by using a coarse solver in the form of 
a reduced model. 

The paper is divided into sections as follows: in Section [2] the mathemat- 
ical model of stochastic kinetics is described in some detail and we briefly 
review the parareal algorithm in Section [SI We present a convergence anal- 
ysis in Section IH the main result indicates that strong convergence might 
be slow for systems with large Lipschitz constants. A way around this is 
suggested in the form of a simple but very general homogenization procedure 
in parallel. This "parareal homogenization" could well be an interesting 
approach to many other stiff problems. We perform three quite different nu- 
merical experiments in Section El where the practical implementation of the 
algorithm is also discussed. Finally, in Section [U] we conclude the paper by 
discussing the outcome of the experiments and suggesting some future work 
and generalizations. 



2 Stochastic chemical kinetics 

In this section we discuss the usual mesoscopic stochastic model for chemical 
reactions (the master equation). As it is needed in the analysis in Section HI 
we shall also give an equivalent and mathematically more appealing repre- 
sentation in terms of a certain stochastic ium p p rocess. For monographs on 
applications of the master equation, see llsL l29|. The mathematical treat- 
ment of jump processes can be found in [2|, |9|, [l5|, l2l|, |27| . 

2.1 Chemical reactions, the master equation and a hi- 
erarchy of methods 

We consider in this paper a general homogenous chemical network consist- 
ing of D different species reacting according to R prescribed reactions. At 
any given time t, the state of the system is an integer vector x G Z;^ = 
{0, 1,2,.. .}^ counting the number of molecules of each kind. In a stochas- 
tic description, each reaction is a change of the state according to a certain 
transition rule and the intensity of this process is governed by a reaction 
propensity, Wr ■ R+. This is the transition probabihty per unit of 

time for moving from the state x to x — N,.; 

X — (2.1) 

where by convention, Nr G is the transition step and is the rth column 
in the stoichiometric matrix N G Z^^^. 
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For instance, in a given well-stirred volume the elementary reactions 
can be written using the states x = [a, b]"^, 



A, 



A 



k2a 



A + A — - — 

. _ kiab/Q 

A + B 



The propensities are generally scaled such that 

Wr{x) = QUr{x/Q) 



(2.2) 
(2.3) 
(2.4) 
(2.5) 

(2.6) 



for some function Ur which does not depend on Q. Intensities of this form 
are called density dependent and arise naturally in a number of situations 



15|, Ch. 11]. 

Arguably the most common stochastic description of chemically reacting 
systems is the chemical master equation (CME). To state it, let p{x, t) be the 
probabilit y th at a certain number x of molecules is present at time t. The 
CME [l8|,l29| is then given by 



dp{x, t) 
dt 



R 



^ Wr{x + Nr)p{x + Nr,t) - ^ Wr{x)p{x,t), (2.7) 



>0 



-N+>0 



where the transition steps are decomposed into positive and negative parts 
as = N+ + N-. 



In a pioneering paper from 1976, Gillespie [22| showed that exact samples 
from the master equation can be obtained via the SSA. This follows essen- 
tially by identifying it with the forward Kolmogorov equation associated to a 
certain continuous-time Markov chain to be introduced shortly. Importantly, 
Gillespie 23| later also showed that the CME is an exact physical description 
when the system is well-stirred and in thermal equilibrium. 

For chemical networks where the number of reactions per unit of time 
is large, detailed simulation with SSA becomes inefficient. As a remedy, 
Gillespie |2J] proposed the tau-leap method with the ability to simultaneously 
"leap" over many reactions at once. The method has since been developed 
and improved by several authors [H, [3l|, 42 . 



A related issue is stiffness: many interesting models become prohibitively 
expensive to solve by explicitly simulating the various involved scales. Several 
different model reduction techniques have been proposed for this situation 0, 



4 



11 



26l |. Also, an implicit version of the tau-leap method has been developed 



41], but this method converges in a very weak sense only (^32. 42 



An alternative approximation is the Langevin equation [15, Ch. 7] which 
is a continuous SDE driven by Wiener processes. On ignoring the noise 
term and only keeping the drift, the reaction rate equations are obtained. 
This is a set of D ODEs approximately evolving the expectation value of the 
mesoscopic model. The hierarchy of solution methods (SSA tau-leap 
Langevin SDE ODE) can thus be thought of as a transition between the 
mesoscopic model and the macroscopic one (24| . 



2.2 Sample path representation 

A representation equivalent to, but perhaps more direct than the master 



equation (12.71) . was proposed fairly recently in [39|]. The idea is to construct 



a sample path representation in the form of jump SDEs driven by Poisson 



processes. As we shall see and as pointed out in |3ll . l39l | , this representation 
is particular useful for numerical analysis. 

We thus introduce the stochastic variable X(t) G counting at time t > 
the number of molecules of each type. Reactions are understood to occur 
instantly and thus the process is right continuous only; the notation X{t—) 
is used to denote the value of the process prior to any reactions occurring at 
time t. We assume the existence of a probability space0 (S, F, P) with the 
filtration Ft>o containing i?- dimensional standard Poisson processes. The 
transition probabilities in (12. ip define counting processes iir (t) 0, Ch. 2.5] 
according to 

E [7Cr{t + dt) - 7r^(t)|Ft] = Wr{X{t-)) dt + o{dt). (2.8) 
In turn, the counting processes determine the process X{t) [isl, Ch. 6.4], 

R 

Xi = Xo-^N,7r,(t). (2.9) 

r=l 

A representation of T^r{t) in terms of a unit-rate Poisson process 11^ in an 
operational or scaled time can also be given 0, Ch. 3.1, 4.4], 

7r,.(t) = Hr (^j^ Wr{X{s-)) ds^ . (2.10) 



^Note that, throughout the paper, the letter Q. is not used for the probabihty sample 
space, but consistently denotes the system volume. 
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We now wish to move beyond fl2.9l) and obtain a more transparent repre- 
sentation in the form of a jump SDE driven by the proper measure. Processes 
with nonhnearly dependent jump intensities are quite difficuh to handle since 
the noise enters with a comphcated dependence on the state. This is in con- 
trast to the more famihar Ito SDE driven hnearly by Wiener processes. 

For the present purposes, the Poisson random measure |28l . Ch. II. Ic] 
^{dt X dz; a) with cr G S defines an increasing sequence of arrival times 
Ti G R+ with corresponding "marks" Zi uniformly distributed in [0, W] 0, 
Ch. 1.4], where W is to be defined shortly. The deterministic intensity of 



fi{dtxdz) is the Lebesgue measure, ■m{dtxdz) = dtxdz [2l|, Part II, Ch. 2§4]. 

The propensities have been general nonlinear functions up to this moment 
but we need to impose conditions so as to bound the total intensity of all 
reaction channels. It is not difficult to see that a finite intensity implies a 
bounded solution in the mean square sense for finite times (see Lemma 14.21 
below). For convenience, we take the approach in [sil] and formally specializes 
the investigation to closed systems. This implies the existence of a bounded 
set S containing at any time t the state of the system and is reasonable 
from physical considerations; — any real application must necessarily involve 
a bounded total number of molecules. 

Summarizing, we thus have 

Assumption 2.1. The state of the chemical network satisfies X{t) G S* C 
Z^. When confined to this set, all propensities are Lipschitz continuous 
in their argument with respect to the Euclidean norm, \wr{x) — Wr{y)\ < 
— ?/||. It follows that all propensities are bounded such that 



Wr ■.= ^maxwsix) (2.11) 



x£S 
s=l 



exists and is bounded for all r. We define also the numbers L = J2r^r, 
W = Wr and W = max^^s Er ^^(a;). Note that W <W. 

The frequency of each reaction is controlled through a set of indicator 
functions Wr ■ S x [0, W] {0, 1} defined as follows: 

1 ifO< z-Wr-i<Wrix), .2 12) 

otherwise. 

We can now represent the counting process fl2.10p in terms of the Pois- 



son random measure 28|, Ch. II. Id] via a thinning of the measure using 
an acceptance-rejection strategy [ol, Ch. 4.3]: 

Tir{t)= j j WriX{t-); z) fx{dtx dz). (2.13) 
Jo Jo 



6 



Note that this thinning is not the same as in 3l|, l39| where instead z G 
[0,H^]; the present construction yields sharper estimates when comparing 
trajectories formed from different initial data. 

The representation f l2.13p combined with (12.91) leads us to the sample 
path representation in terms of a Skorohod jump SDE [27', Ch. IV.9], 



R 



dXt = -Y^ N,. / Wr{X{t-)] z) fx{dt X dz). (2.14) 

Often one is interested in separating fl2.14p into its "drift" and "jump" terms 
21I, Part II, Ch. 2§4], 

dXt = -^NrWriX{t-))dt -^Nr / WriX (t-)] z){fx - m){dt X dz) 

=:dXD{t) + dXj{t). (2.15) 

We will use the initial value convention X]j{0) = X{0) and Xj{0) = 
throughout the paper. Note that the second term in (12.151) driven by the 
compensated measure (/i — m) is a martingale of mean zero. On taking 
expectation values and approximating, we obtain dEXf ^ dx{t) where 

R^ 

dx{t) = - ^ NrWr{x{t)) dt (2.16) 



r=l 



and a;(0) = EX{0). This approximation constitutes, up to a scaling by the 
system volume fl, the usual macroscopic reaction-rate equations. 



3 The parareal algorithm 



We continue by giving a short account of the parareal algorithm. For a quick 
introduction with additional references the survey 43|] can be recommended. 



171 . |4J] for abstract convergence results, stability analysis and 



See also 
more. 

Consider the general time-dependent problem written in operator form, 
M = -Au, t e [0, T] with u(0) = uq. (3.1) 
Write the solution propagator as 



Hy) = y- f 

Jo 



Au{t) dt, 



(3.2) 
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where u solves fl3.1l) subject to u{0) = y. The parareal algorithm assumes 
that a coarse solver C is available that does the same thing but faster and 
presumably less accurate. 

Time is now discretized in = T/At chunks of time and any solver 
S G {J^At,CAt} can be used to compute a numerical solution: 



(3.3) 



or simply B{S)v = Uq in matrix notation. The parareal algorithm emerges as 
the fix-point iteration obtained by using B{CAt)~^ as an approximate inverse 
to B{J^^t): 



I 








" 




^'0 




Uq 


-s 


/ 










Vl 










-s 


/ 







V2 













-s 


/ 




V3 








Vk+l 



(3.4) 



Let fo,o = uo and fo,n = CAtVo,n-i for n G [1,A^] to start up the algorithm. 
One readily verifies the recursion 



Vk,n — ^AtVk-l,n-l — [CAtVk-l,n-l — CAtVk,n-l\ 



(3.5) 



where the evaluation of JF is trivially parallel. 

At any point in the algorithm, the preconditioned residual is given by 

Pk = Vk-Vk+i, (3.6) 

and is useful as an estimate of the error. 

A nice survey of the parareal method is found in 17| where the connec- 
tion to some earlier methods is made. It turns out that one can view the 
process as a Newton method applied to a certain nonlinear problem using a 
Jacobian approximated by finite differences. This partially explains the su- 
perlinear convergence often observed on bounded intervals. It is known that 
the parareal algorithm converges poorly for problems where the operator has 
eigenvalues with large imaginary parts ^] . Another interesting result is that 
the parareal scheme can be made unconditionally stable by using a suitable 
relaxation of the coarse solver 



4 Analysis 

In this section we analyze the parareal algorithm obtained by using the rate 
equations (12.161) and the jump SDE fl2.14p as the coarse and fine solver, 
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respectively. In Section 14.11 we give some results that cover local properties 
of solutions to these equations. The actual convergence analysis is found 
in Section 14.21 where convergence in mean square and convergence of the 
first moment is investigated. As noted in j^, weak convergence is not so 
interesting for the parareal algorithm since weak estimates typically come into 
play when many trajectories are simultaneously computed in a Monte Carlo 
simulation; parallelism can then be trivially achieved with optimal efficiency. 
Our interest in the convergence of the first moment mainly stems from the 
possibility to homogenize the model on the fiy by a suitable averaging filter. 
This idea is discussed in Section 14.31 

Because the process X{t) is integer valued, we mention here for clarity 
the trivial inequality 

\X\<X^^E\X\<EX^. (4.1) 

When X is large, (14. ip is an overestimate and we will switch to the standard 
inequality 

E\X\ < {EX'^f'^. (4.2) 

For simplicity and without serious loss of generality, we treat the 1-D case 
only. 



4.1 Preliminaries 

We start by citing the following convenient lemma. 

Lemma 4.1 (Lemma 3.1 and 3.2 in [3l|; see also Lemma 2.3.2 in [it]). Define 
/S.Xt = X{t) — X{t—). Then for any fixed time s > 0, AXg = (a.s.). 
Furthermore, if w is a continuous function and ti <t2, then 

ti 

Aw{Xt)dt = 0. (4.3) 

L 

We continue by establishing two lemmas concerning the stability of the 
jump process X{t). 

Lemma 4.2. Let X{t > 0) evolve according to f l2.14p and define the jump 
term Xj{t) as zn (127151) with Xj{0) = 0. Then 

EXj{tf < (4.4) 

Furthermore, 

E[Xt - Xof < 2\\N\\'^Wt + 2||N||^iy V. (4.5) 
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Proof. The left side of f l4.4l) can be written 

J J ^^-^rWr{X{s-)] z){ii-m){dsx dz)\ 

= Ej J \y2-NrWr{X{s-)] z)j m{dsxdz) 

<E / \\Nfy2'^riX{s-); zfm{ds X dz), 
Jo Jo 

where the Ito isometry for jump processes was used 21, Part II, Ch. 2§5] [2I, 
Lemma 4.2.2]. Using that w'^ = Wr we complete the proof of (14.41) . For (14. 5 p 
we use the drift/jump spht (12.151) and the inequality (a + bY < 2a^ + 26^ to 
obtain 

E[Xt - Xo]^ < 2EXj{tf + 2E ^ -nrWr{X{s-)) rfsj . 

□ 

Lemma 4.3. Let the drift term he defined as in (12.151) with ^^(O) = X{0) 
and let x{t) be the reaction rate approximation (I2.16P to the expected value 
ofXit) with x(0) = X(0). Then 

E[XD{t) - x{t)f < L^IINII Wexp(2LlN||V). (4.6) 

Proof. By Jensen's inequality and the Lipschitz boundedness we get 

E[XD{t) -x{t)]^ < L^\\N\\HE [\x{s)-x{s)]^ds 

Jo 

<L^\\N\\h I 2EXj{sf + 2E[Xd{s) - x{s)f ds 
Jo 

< L2||N||W + 2L2||Nft f E[Xd{s) - x{s)f ds, 

Jo 

where the first part of Lemma 14.21 was used. The result now follows as an 
application of Gronwall's inequality. □ 

Remark. In a closed system we can identify the magnitude of X with the 
system volume f2; maxa;g5 ||x|| = 0(f2). For density dependent propensi- 
ties (see (12. 6p ). this implies that W ^ VL while L does not scale with VL. 
Lemma 14.21 and 14.31 then yields 

{E[X{t)-i{t)fy'^ <Ctn^/^ (4.7) 



10 



for some bounded constant Ct- By Chebyshev's inequality we thus have 
convergence in probabihty Q~^X{t) Q^^x{t) a.s Q —>■ oo. For more precise 



reasoning of this type, see [15 



We proceed with a resuh that is crucial to the convergence properties of 
the parareal algorithm. 

Theorem 4.4. Let X{t > 0) and Y{t > 0) evolve according to (12.141) using 
the same underlying set of Poisson processes but with different initial values 
Xq and Yq. Define the jump terms Xj{t) and Yj{t) as in Lemma \4 ■ S\ Then 

E [Yj{At) - Xj(At)]' < L||Nf At|yo - Xo|(l + O (At)). (4.8) 

Proof. By the Ito isometry we can bound the left side of (14. 8 p by 

Ej / \M\^y^[wr{Y{t-)] z)-Wr{X{t-)] z)fm{dt^dz) 
Jo Jo 



r=l 

.At R 



<E||N||2 / J2\WriY{t-)) - WriXit-))\dt, 
Jo r=l 



where the usefulness of the thinning proposed in Section [2^ is evident. From 
the Lipschitz assumption and Lemma 14.11 we get the bound 

rAt pAt 

<L||Nf / E\Y{t-)-X{t-)\dt = L\\nf E\Y{t)-X{t)\dt 
Jo Jo 

/■At 

< Lpll^At \Yo - Xo\ + L\\Nf / E \Y{t) -Yo\+E \X{t) - Xo\ dt 

Jo 



< LmrAtlYo -Xn\ + 4L||Nr / wt + WH' dt 



At 

2.2 



where the integer inequality (14. ip and the second part of Lemma 14.21 was 
used in the last line. □ 



The following result, similar in spirit to Theorem l4.4[ is useful for studying 
the error in the first moment alone. 

Theorem 4.5. Define Y£){t), y{t), X]j{t) and x{t) as in Lemma "JTS. Then 

\EYD{At) - y{At) - EXoiAt) + i{At)\ < O {At^^^) \E[Yo~Xo]\. (4.9) 

Proof. If Yo = ^0 there is nothing to prove and we may thus assume that 
llo ~ -^o| > 1- Two applications of Lemma [4.31 gives that 

YoiAt) - y{At) - XoiAt) + x{At) = O [A^'^) . 

We recover (14.91) by inserting the factor (Yq — -^o); taking expectation and 
absolute values of both sides. □ 
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4.2 Convergence of the parareal algorithm 

For the application of the parareal algorithm we let Ty be the stochastic 
evolution in a time-step At of the jump process X{t) obeying fl2.14p with 
initial data y. Following (12. 151) . we write JF = JF^ + jFj for the drift- and 
jump term respectively. The coarse solver Cy is instead the evolution in a 
time-step At using the deterministic rate equations fl2.16p starting from y. 

We denote by X = [Xq, . . . , X^v] with X„ = X(t„) the exact solution at 
times tn = nAt. Xk^n ~ X„ is the numerical approximative solution obtained 
after the kill iteration of the parareal algorithm: 

Xfc,o = Xo, k> 0, (4.10) 
Xo,„ = CXo,„_i, n > 1, (4.11) 

Xk,n = ^Xk-l,n-l — CXk-l,n-l + CXk^n-1-, (4-12) 

where {k,n) > 1 in fHl2l) . 

We now wish to analyze the root-mean-square (RMS) error defined by 

el^ = E[Xk,n-X^f. (4.13) 
This measure satisfies by f l4.12l) the recursion 

= E[T, + T2 + T,f 

= E [Tl + r| + Tl + 2T1T2 + 2T1T3 + 2T2T3] , (4. 14) 
where in terms of 

Tl = TjXk^l^n-l — ^.jXn-i, (4-15) 

T2 = ToXk-l^n-l — CXk^l^n-1 — ^oXn-l + CXn-l-, (4.16) 

T^ = CXk,n-i-CXn-i. (4.17) 

By Theorem 14.41 using (14. ip and Lemma 14.31 we get 

ETl < L||N|rAteLi,„_i(l + O (At)), (4.18) 
ET| < O (At^) . (4.19) 

We also have the Lipschitz bound 

ETi<eMmmmeln-i- (4.20) 
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As for the cross-terms we have that ETiT^ = since Ti is a martingale of zero 
mean and T3 is a deterministic (non-anticipating) function. The remaining 
terms can be easily estimated as 

2ET1T2 < AtET^ + Af-^ET^ < O {Af) e\^_^, (4.21) 
2ET2T3 < Ar^ET^ + AtETl < O [At^) + O {At) e\^_^. (4.22) 

Summarizing, we have from f l4.14p upon ignoring higher order terms in At, 

el^ < L||N|pAteti,„_i + exp(2L||N|| At)e^,„_, 

= : AtS'^eU^^_, + exp{2AtSc)el^_,. (4.23) 

Define M = max„ eo,n in view of fl4.7p . We readily recognize the binomial 
recurrence in (14.231) so that 

el^ < M^(j^At''Sfexp{2AtSc{n - A;)). (4.24) 

If we now look at iteration k = O (1) and interval n = N = T/At, (I4.24p can 
be simplified into 

ek,n < C^,tS%, (4.25) 

where Ci^t is a bounded constant for any given total time T. Eq. f l4.25p 
shows mean square convergence for contractive problems where Sj: < 1 only; 
it is unclear what happens for systems with larger Lipschitz constants. A 
key to understanding how the analysis can be refined lies in the fact that 
the integer inequality (14. ip had to be applied before using Theorem 14.41 and 
arriving at (I4.18p . For an initial large error, the inequality (14. 2 p is sharper 
but introduces a nonlinear dependence: 

„ < AtSleu-i,n-i + exp(2At5c)e^ (4.26) 

This motivates our interest in the next proposition. 

Proposition 4.6. Consider for {k,n) > a sequence ak,n of nonnegative 
numbers. Let akfi = for k > and suppose that a^ ^ < C for n > and 
some nonnegative constant C . Then for k <n the inequality 

1/2 

where A> and B > 1 is satisfied by 

9 9I — fc 9 9I — k I. 

ak,n<A^ B'^-^n-kf . (4.27) 
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A proof by induction is easily constructed once the functional form in fl4.27l) 
has been obtained. 

To apply Proposition l4.6l we put A = AtSjr, B = exp(2AtSc) and identify 
Gk,n = (^kn ^^sX C = M^. The result is for k <^ n = N that 

ek,n < C2,tM^~\ (4.28) 

for some constant C2,t- During the first few iterations, this bound is sharper 
than fl4.25p which is only valid when the error is already small. For instance, 
if as in gTj) we have that M ~ Vt^/'^, then ek,N ~ [^^/^ . . .] for 

the first few iterations k = [0, 1, 2, . . .]. 

As it provides us with additional insight, we also comment on the error in 
the first moment. Define for this purpose Ck^n = \E[Xk^n — Xn\\- Proceeding 
as before and using Theorem 14.51 it is not difficult to see that this error 
satisfies 

ek,n < O (At3/2) ek-i,n-i + exp(L||N|| At)efc,„_i 

< {^^At^^'^ <C^^T^t^/\ (4.29) 

The estimates fl4.25p . f l4.28p and fl4.29p show that for At sufficiently small, 
the contractive parts converge in mean square while the non- contractive parts 
converge in the sense of fl4.29p only. This interpretation opens up for modi- 
fying the scheme in such a way that the fast scales are actively filtered away. 



4.3 Homogenization 

The previous analysis suggests that problems with large Lipschitz constants 
may be problematic to solve by the proposed parareal algorithm. On the 
other hand, for certain problems involving very rapid scales, pathwise con- 
vergence may not be so interesting to obtain. Rather are we interested in 
convergence to a homogenized model which by itself is a weak approxima- 
tion to the original system. For example, this situation occurs in the current 
context when rapid reaction channels almost balance each other so that the 
interesting dynamics occurs on a much slower scale. 

Unlike the deterministic case where implicit solvers generally evolve stiff 
problems efficiently, it has been proposed that model reduction techniques 



are necessary in the stochastic setting [32]. For instance, the implicit tau- 



leap method has only been shown to be weakly convergent under the rather 
strong assumption of linear propensities js], |42[ . 

Provided that the coarse solver is sufficiently dissipative, the biggest chal- 
lenge in directly using the current parareal algorithm for stiff problems is to 
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detect when the numerical solution is of sufficient quality; since convergence 
is at best weak, any jumps on the order of the noise associated with the fast 
scale could in principle be accepted. 

An easy way around this is to replace the fine propagator T with a ho- 
mogenized version . A simple example is 



i.e. an average of the exact trajectory over an interval bt. This interval 
should be large enough to contain several fast reactions but short enough to 
be essentially independent on the slow scales. One can easily think of much 
more advanced filters to create a suitable homogenization but we settled for 
the immediate (14.301) . 

Let us remark that an essentially "exact" trajectory is available within 
the homogenized solution since the non-averaged solution is always com- 
puted ffist. This "exact" trajectory is just as noisy as the true solution but 
allows for jumps related to the fast scale in between successive time intervals 
[tn,^n+i]- An intuitive interpretation is that the homogenized trajectory is 
an exact sample from a nearby model containing additional reactions that 
are scheduled at deterministic time-steps At. In principle, the intensity of 
this unknown process could be estimated and be put in relation to the rest 
of the propensities. 

5 Numerical examples 

After discussing the implementation of the proposed method, we will consider 
three numerical examples. The ffist is a typical example of when stochastic 
models are necessary since mean-field equations generally give incorrect re- 
sults. Nevertheless, when they are used as a preconditioner in the parareal 
algorithm, convergence to the true solution is quite fast. The second example 
is representative of situations involving multiple scales with fast transients. 
Finally, we obtain a trajectory from the reaction- diffusion master equation 
which is a good representative of very large networks. Here, the macroscopic 
model is just the familiar reaction-diffusion partial differential equation. 

5.1 Implementation 

For the purpose of performing experiments we have implemented a serial 
version of the parareal algorithm in Matlab. The coarse solver is thus simply 
the reaction rate equations solved by any suitable ODE-solver; we typically 




(4.30) 
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used Matlab's ode23s but occasionally got away with just a single step of 
the backward Euler method. As the fine solver we used an implementation of 
the next reaction method (NRM) , which is statistically equivalent to the 
SSA but specifically tuned to large networks. The reason for preferring this 
particular implementation is that consistent Poisson processes may easily 
he sampled. This feature is achieved by simply using the same sequence of 
random numbers for each reaction channel. 

To appreciate where this technicality enters in the analysis, note that in 
Theorem 14.41 Yt and Xt are assumed to be trajectories conditioned on the 
same Poisson process. If they are not related in this respect, the associated 
constants are much larger. 

In order to estimate the quality of the numerical solution we used the 
relative Euclidean norm of the preconditioned residual as given by (13.61) : 

Cfc+i ^maxD'^/'^\\{Vk,n-Vk+l.n)/{^+Vk+l.n)\\, (5.1) 
n 

(elementwise division). This was then taken as an approximation to the true 
relative error 



max D-^l^\\ {vk+i,n - m„)/(1 + Wn) || • (5.2) 

n 



We generally did not round any fractional results obtained from the coarse 
solver. Formally, this introduces a complication in the analysis as the solution 



space becomes a continuum (see 3l|, Remark 3.1]) but we have not found 



any benefits in forcing the solution to stay in Z;^. However, for fractional 
states, the propensities should explicitly be set to zero whenever executing 
the reaction would result in a negative copy number. 



5.2 Stochastic toggle switch 

A toggle switch found within the regulatory network of E. coli has been mod- 
eled by two mutually cooperatively repressing gene products X and Y [19 
The model is 



^^1^ X ^^1^ Y 

X ^ F ^ 



(5.3) 



with parameters a = 3000, b = 11000 and fi = 10^^. In (15. 3p . note that if the 
number of X-molecules is larger than the number of F-molecules, then the 
production of y-molecules is inhibited and the system finds a stable state 
with X > y. However, the intrinsic noise can make the roles of X and Y 
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suddenly switch to a state where instead the production of X-molecules is 
inhibited. The system (15.31) thus constitutes an interesting example where 
the deterministic dynamics clearly differs from that obtained by a stochastic 
simulation. 

In Figure 15.11 the exact trajectory up to final time T = 5 x 10^ is dis- 
played together with the approximation obtained after a few iterations by 
the parareal algorithm on a very coarse grid with At = T/50 = 10^. Interest- 
ingly, the next correct place to switch is found for each new iteration so that 
all such events are correctly located within the first 4 iterations. Although 
the numerical solution sometimes overshoots, some information is evidently 
being correctly propagated through the system. 




012345 012345 



Figure 5.1: Dashed: exact trajectory of the two components of the toggle 
switch (15. 3 p versus time. Solid: solution obtained in parallel after 0, 1, 2 and 
4 iterations. Top left: the initial solution from the reaction rate equations 
immediately settles at one of the stable states. Top right and bottom left: 
the first and second switches are correctly obtained after respectively one 
and two iterations. Bottom right: all 4 spontaneous changes of state have 
been correctly detected. Note the rather large level of noise. 
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The relative errors and residuals are plotted in Figure I5.2[ It is seen that 
the convergence initially is very rapid but then reaches a plateau where the 
error stays within a small fraction of the order of the noise level. The accuracy 
thus obtained is quite reasonable and is at the order of a 1% perturbation of 
the propensities in (15.31) . 




Figure 5.2: Maximum relative errors and residuals (see (15. ip and (15. 2p ) ob- 
tained for the toggle switch during the first 20 iterations. The accuracy is 
quickly improved during the first 5 iterations and then settles more slowly. 
The horizontal line (dash-dot) is the induced difference when the propensities 
are perturbed by ±1%. 



5.3 Homogenization of disparate rates 

As a specific model containing two different scales we consider fast dimeriza- 
tion combined with slow isomerization, 





1/e 


X2 -\- X2 




1 


Y2 


Y2 + Y2 


1/s 


Yi + Y, 



where the small parameter e controls the difference in scales. For this example 
we took e = 10~^ and initial data [xi, X2, yi, 2/2](0) = [15, 5, 30, 10] with final 
time T = 10. The fast channels thus fire about 10^ times more often than the 
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slow ones and we note also that the fast channels have quadratic propensities 
(e.g. Wi = Xi{xi — l)/e) so that the rate equations are inexact. 

For the parareal discretization we used At = T/50 as before and an ex- 
tremely simple coarse solver in the form of a single step with the linearized 
backward Euler method. The homogenization procedure described in Sec- 
tion was used with the fine solver defined in fl4.30l) using St = At/2. The 
resulting combination was very effective indeed in obtaining a homogenized 
solution, see Figure [531 and [5^ 




Figure 5.3: Dashed: homogenized solution for (15. 4p versus time. Solid: so- 
lution after (top) and 1 (bottom) iteration of the parareal algorithm. For 
comparison, the corresponding non-homogenized trajectory is also plotted 
using a dotted line. For this example, a single step of the backward Eu- 
ler method was a sufficient accurate coarse scale solver which therefore is 
extremely cheap to evaluate. 
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Figure 5.4: Convergence history for ( 15.41) homogenized according to (14.301) . 



5.4 Stochastic reaction-diffusion 

There are many interesting chemical systems where the spatial distribution 
of the species must be taken into account. The usual thermodynamical equi- 
libria assumption ( "well stirredness" ) no longer holds as the transport of the 
molecules through the solvent is slow compared to average reaction times or 
when some reactions are strongly localized. The length scale over which the 
system can be regarded as homogeneous is now much shorter. 

Examples of when both the stochasticity of the reactions and the spatial 
distribution are necessary to explain experimental data are found in 10, 



121 . 1161 . I30I . I37l |. For such systems, the diffusion at a molecular level can 
be treated as a special set of linear propensities. This yields the reaction- 
diffusion master equation (RDME) [18, Ch. 8], [29, Ch. XIV] which evolves 
the probability density of the system in the same manner as the CME f l2.7l) . 
However, the dimensionality of the state-space is much higher and computing 
even a single trajectory can be a very computationally intensive problem. 
Note also that, as the rate equations now form the reaction-diffusion PDE, 
the master operator defines a continuous spectrum of scales so that a clear 
separation into slow/fast ones is no longer possible. 

We shall consider a small example in the present section as follows. For 
i = 1 ... 5, denote by the triplet {x, y, z)i the number of Y- and Z- 
molecules in cell i. The cells each have volume Vt and for simplicity are 
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connected in an array 1-2-3-4-5. Within each cell i we specify the reactions 




(5.5) 



compare f l2.2l) -( l23i) . Furthermore, the molecules can diffuse to any neigh- 
boring cell j = i±l G {1,...,5} according to 





X, 


dyi/h'^ 
Yi > 




dzijh'^ 

Zi > 


z, 



(5.6) 



where h = is the length-scale. Finally, there are also in- and outflow at 
the boundaries, 



dNnlh'^ „ „ dzxlh'^ 
> Ai Zi > 

dNn/h^ „ dz,,/h^ 
^ Zk > 



(5.7) 



In order to fully describe the system we have chosen constants to ap- 
proximately coincide with those in l3] for E. coli, see also 0]. The model 



is thus parameterized by the integer Nq defining the total volume through 
V = 10~^^ X Nq/25 = 5Q (l). The rate constants are given by ka = 10^ 
{M-^s~^), kd = 10 (s"i) and d = 10"^° (m^s"^). Since we have that Nn oc fi, 
for convenience, we loosely refer to both quantities as the "system size" . The 
model is normalized around Nq = 25 molecules per cell and as initial data 
we simply took (xj, yi, Zi) = {Nq, iVn, 0) in all cells. 

For the parareal algorithm we again took N = 50 intervals with total 
time T = 1. In the semi-discrete case, one can show that the reaction-rate 
equations for the diffusion part fl5.6p are equivalent to a mass-lumped FEM- 
method for the macroscopic diffusion equation; this observation has been used 



to generalize the mesoscopic model to more complicated geometries ^J] . We 
note in passing that this setting opens up for extremely efficient coarse-grain 
solvers based on multigrid techniques when the spatial resolution increases. 

In Figure 15.51 the error during each parareal step is displayed. There 
is a trend with faster convergence as the system size increases and despite 
the seemingly rather large error levels, the solution obtained is visually very 
pleasing and can hardly be distinguished from the exact one (cf . Figure 15.70 . 
In Figure 15.61 we have used instead the homogenized fine scale solver fl4.30p 
and the convergence improves a lot. Again, smoothing the output from the 
fine solver by integrating over a short period of time has the effect of scaling 
down the effective Lipschitz constant and convergence to the homogenized 
solution becomes fast. 
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System size Q. 



Figure 5.5: Relative error per the first few parareal iterations when the sys- 
tem size Q increases. 

6 Conclusions 

The master equation, or equivalently, a special type of jump SDEs accu- 
rately describes well stirred chemical reactions at the mesoscopic level. Dif- 
fusion can be incorporated by a special set of reactions implying that non- 
homogenous problems also can be modeled using the same type of process. 
In the macroscopic limit the usual rate equations (or the reaction- diffusion 
PDE) emerges. 

The parareal algorithm can be applied by using the rate equations as a 
predictor for the jump process. Convergence is then dictated by the Lipschitz 
constant of the system, but also by the level of noise in the solution. This 
noise vanishes in the macroscopic limit. 

For models involving rapid transients the best one can generally hope for 
is convergence in moment, but it is then not clear how the error should be 
monitored. A remedy is to homogenize the model on the fly by averaging 
the process in time. This homogenization is very general and should be 
applicable to a wide range of stochastic time-dependent problems. We remark 
that obtaining this homogenized solution is impractical on a serial computer 
since the full solution must be obtained first. Furthermore, the proposed 
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Figure 5.6: As in Figure [575] but using a homogenized fine solver (14.301) with 
5t = At/4. The convergence to the homogenized solution is faster and more 
regular. 
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Figure 5.7: Sample solution for the reaction-diffusion system flS.SI] . fl5.6p 
and fl5.7l) . The system-size parameter is for this example A^^ = 100 molecules 
per cell. Solid: solution obtained after 4 parareal iterations over 50 proces- 
sors, dashed and barely visible: exact trajectory. From top to bottom are the 
trajectories for the X-, the Y- and the Z-molecules, respectively. Towards the 
end of the simulation there is a strong negative correlation between X- and 
F-molecules and there is also a spatial correlation due to the unsymmetric 
inflow CT . 
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homogenization can be put in contrast to other multiscale methods |7|, [ill] . 
Typically one computes a solution to a homogenized equation obtained by 
determining some kind of averaged coefficients. With the parareal algorithm, 
one rather directly obtains a homogenized solution. 

The range of applicability of the method can be increased by adding a 
white-noise term to the governing equations, thus encompassing highly gen- 
eral Levy-type SDEs In this general setting, the suggested homogeniza- 
tion provides a way of applying the parareal algorithm to stiff problems. 

We would also like to relate our results to those in reference |5|] mentioned 
in the introduction. There it is shown that the parareal/Euler forward com- 
bination for a nonstijf Wiener SDE yields a mean square estimate O (At^/^) 
for the kth parareal iteration assuming a practically exact fine solver. This 
holds true in the present context as well provided that we use the appro- 
priate version of the forward Euler method, namely the tau-leap method 
[Ml]- However, the resulting method is highly nontrivial, if possible at all, 
to implement efficiently. The reason is that the same realizations of Poisson 
processes have to be used for both the coarse and the fine solver. The list 
of events simulated by the fine solver must somehow be reused and searched 
through leading to a very expensive coarse solver. An open question is thus 
if one can somehow circumvent this issue. 

One of the most promising applications of the proposed method seems to 
be for react ion- diffusion models such as the one investigated in Section 15. 4[ 
Here the macroscopic model can typically be acceptable in many, but not 
all, subvolumes. Some of the species are typically present in fairly large 
copy numbers where noise is less pronounced. A few iterations of the sug- 
gested parareal algorithm can thus be understood as a kind of determinis- 
tic/stochastic hybrid method. A feature with this set-up is that one never 
needs to explicitly determine what parts should be treated deterministically. 
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